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Abstract 

This paper considers the problem of optimal recovery of an element m of a Hilbert space % 
from measurements of the form ij{u), j = 1,... ,m, where the £j are known linear functionals on 
Ti. Problems of this type are well studied m and usually are carried out under an assumption 
that u belongs to a prescribed model class, typically a known compact subset of H. Motivated 
by reduced modeling for solving parametric partial differential equations, this paper considers 
another setting where the additional information about u is in the form of how well u can be 
approximated by a certain known subspace W of R of dimension n, or more generally, in the form 
of how well u can be approximated by each of a sequence of nested subspaces Vb C G • • • C 14 
with each 14 of dimension k. A recovery algorithm for the one-space formulation was proposed 
in |16j . Their algorithm is proven, in the present paper, to be optimal. It is also shown how 
the recovery problem for the one-space problem, has a simple formulation, if certain favorable 
bases are chosen to represent 14 and the measurements. The major contribution of the present 
paper is to analyze the multi-space case. It is shown that, in this multi-space case, the set 
of all u that satisfy the given information can be described as the intersection of a family of 
known ellipsoids in 'H. It follows that a near optimal recovery algorithm in the multi-space 
problem is provided by identifying any point in this intersection. It is easy to see that the 
accuracy of recovery of u in the multi-space setting can be much better than in the one-space 
problems. Two iterative algorithms based on alternating projections are proposed for recovery 
in the multi-space problem and one of them is analyzed in detail. This analysis includes an a 
posteriori estimate for the performance of the iterates. These a posteriori estimates can serve 
both as a stopping criteria in the algorithm and also as a method to derive convergence rates. 

Since the limit of the algorithm is a point in the intersection of the aforementioned ellipsoids, 
it provides a near optimal recovery for u. 
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1 Introduction 


1.1 Background and motivation 

The emergence of computational and experimental engineering has led to a spectrum of new mathe¬ 
matical questions on how to best merge data driven and model based approaches. The development 
of corresponding data-assimilation methodologies has been originally driven mainly by meteorologi¬ 
cal research (see e.g. da HU) but has meanwhile entered numerous areas in science and engineering 
bringing, in particular, the role of reduced order modeling into the focus of attention [!]• 

The present paper addresses some principal mathematical aspects that arise when trying to 
numerically capture a function u which is a state of a physical process with a known law, however 
with unknown parameters. We are given measurements of this state and the question is how to 
best merge these measurements with the model information to come up with a good approximation 
to u. 

A typical setting of this type occurs when all states of the physical process are described by a 
specihc parametric family of PDEs which is known to us, in a form 

V{u,n) = 0, 

where /r is a vector of parameters ranging in a hnite or infinite dimensional set V. Instead of 
knowing the exact value of /i which would allow us to compute the state u = u{pi) by solving the 
equation, we observe one of these states through some collection of measurements and we want 
to use these measurements, together with the known parametric PDE, to numerically capture the 
state, or perhaps even more ambitiously to capture the parameters. Since the solution manifold 

M := {u{p) : pL € P}, 

to a parametric PDE is generally quite complicated, it is usually seen through a sequence of nested 
finite dimensional spaces 


EoCPiC---CK, dim(P,)=j, 

such that each Vj approximates Ad to a known tolerance Ej. Construction of such spaces is some¬ 
times referred to as model reduction. Various algorithms for generating such spaces, together with 
error bounds e^, have been derived and analyzed. One of the most prominent of these is the reduced 
basis method where the spaces are generated through particular solution instances u{pd) picked from 
Ad, see [a [sum HI]. Other algorithms with known error bounds are based on polynomial approx¬ 
imations in the parametric variable, see 

Thus, the information that the state u we wish to approximate is on the manifold is replaced 
by the information of how well u can be approximated by the spaces Vj. Of course, this is not 
enough information to pin down u since we do not know where u is on the manifold, or in the 
new formulation, which particular element of Vj provides a good approximation to u. However, 
additional information about u is given by physical measurements which hopefully are enough to 
approximately locate u. This type of recovery problem was formulated and analyzed in m using 
an infinite dimensional Hilbert space setting which allows one to properly exploit the nature of the 
continuous background model when assimilating observations. This is also the setting adopted in 
the present paper. 
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The achievements of the present paper are two-fold. First, we establish that the algorithm 
proposed in m for estimating a state from a given set of observations and the knowledge of 
its approximability from a space Vn is best possible in the sense of optimal recovery. Second, 
and more importantly, we demonstrate the potential gain in accuracy for state recovery when 
combining the approximability by each of the subspaces Vj in the given hierarchy. We refer to this 
as the multi-space setting which will be seen to better exploit the information given by reduced 
bases or polynomial constructions. We give algorithms and performance bounds for these recovery 
algorithms in the multi-space setting when the observations are fixed and given to us. These 
algorithms are online implementable, similar to the ones discussed in [16]. Let us mention that 
one emphasis in m is on the selection of the measurement functionals in order to optimize the 
recovery process, while in the present paper we consider such functionals as given and focus on 
optimal recovery as explained above. 

1.2 Conceptual preview 

We study the above problems in the general framework of optimal recovery in a Hilbert space T-L 
with inner product (•, •) and norm || • ||. Under this setting, we are wanting to recover a function 
u GTi from its measurements ii{u) = {u, cjj), where the w* are known elements of Ti, i = 1,... ,m. 
If we denote by W the space spanned by the Wj, i = 1,..., m, then, the measurements determine 
w = Pwu where throughout this paper Px denotes the orthogonal projection onto X for any closed 
subspace X G T-L. In going further, we think of measurements as simply providing the knowledge 
of this projection. In particular, we assume that the Wj’s are linearly independent i.e., dim IT = m. 
Therefore, our problem is to find an approximation u{w) to u from the information w G W. This 
is equivalent to constructing a mapping A -.W ^ V. and setting u{w) = A{w) = A{P\yu). 

All elements of the orthogonal complement W'^ of W have zero measurements. A first obser¬ 
vation is that if all the information we have about u is that Pwu = w, then we cannot recover u to 
any guaranteed accuracy. Indeed, if uq satisfies the measurements then u could be any of the func¬ 
tions uq r], with T] G W-^, and each of these functions would be assigned the same approximation 
u = u{w). Therefore, we need additional information about u to have a meaningful problem. A 
typical assumption is that u is in some known compact set S G T-L. The recovery problem in this 
case is known as optimal recovery. A classical setting is that Ti is the space L2 and 5 is a finite 
ball in a Sobolev or Besov space, see e.g. OETllIHI. 

In contrast to the case where 5 is a known Sobolev or Besov ball, our interest is in the setting 
where S is the solution manifold A4 of a parametric PDF. As noted above, the typical way of 
resolving A4 is through a finite sequence of spaces {Vq, ... ,Vn} with I4 of dimension k where the 
spaces are known to approximate Ai to some known accuracy. This leads us to the following two 
settings: 

The one-space problem: We assume that all what we know about Ad is that there is a space 
Vn of dimension n which is an approximation to Ad with accuracy £„. Accordingly, we define 

K, := := {u gT-L : dist(u, Vn) < £n}, (1-1) 

and consider u G /C to be the only information we have about Ad. In this case, the information 
(II.ip is the additional knowledge we have about u. We want to combine this knowledge with our 
measurements Pwu to construct a good approximation u to u. So in this case, the spaces Vn and 
W are known and fixed. 
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The multi-space problem: We assume that what we know about A4 is that there is a sequence of 
spaces Vo C th C • • • C Fn such that each Vk has dimension k and approximates A4 with accuracy 
Ek, where Eq > > ■ ■ ■ £n > 0- This leads us to define 


1C ■= ;= Pi 

j=0 


where 


( 1 . 2 ) 


IC^ := {u GJi : dist(M, Vj) < Ej}, j = 0,... ,n. 

In this case, the information u G JC is the additional knowledge we have about u. We want to 
combine this knowledge with our measurements to construct a good approximation u to u. As 
already noted, the multi-space problem is typical when applying reduced bases or polynomial 
methods to parametric PDEs. 

1.3 Performance criteria 

This paper is concerned with approximating a function u G T-L from the information that u G JC 
and Pwu = u) in the two above settings. Note that in both settings, the set JC is not compact. The 
additional information provided by the measurements gives that u is in the class 

JCu, := {u G JC : Pwu = w}. 

This set is the intersection of JC with the affine space 

Tiw '■= {u G% ■. Pwu = w} = w + W^. 

Note that JC^ may be an empty set for certain w G W . 

Recall that an algorithm is a mapping A : W ^ T-L which assigns to any w G W the approx¬ 
imation u(w) = A[Pwu). In designing an algorithm, we are given the information of the spaces 
{Vk)k=o,...,n and the error bounds {Ek)k=o,...,n- There are several ways in which we can measure the 
performance of an algorithm. Consider first the one-space problem. A first way of measuring the 
performance of an algorithm is to ask for an estimate of the form 

||u — A(Pvi/^f)|| < C'A('a’) dist(tt, 14), uGJCw (1.3) 

The best algorithm A, for a given fixed value of w, would give the smallest constant (7^4(tc) and the 
algorithm which gives this smallest constant is said to be instance optimal with constant Ca{w). 
In this case, the performance bound given by the right side of ()1.3I) depends not only on w but on 
the particular u from JCw 

The estimate (11.31) also gives a performance bound for the entire class JCw in the form 

sup \\u — A{Pwu)\\ < CA{w)En- 

This leads us to the notion of performance of a recovery algorithm A on any set S G% which is 
defined by 


Ea{S) := sup ||rt — A{Pwu)\\. 
u£S 
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The class optimal performance on the set S is given by 


E{S) :=mfEAiS), 

A 


(1.4) 


where the infimum is taken over all possible algorithms, i.e., all maps A : W ^ T-L. In particular, 
class optimal performance is defined for both the single space or multi-space settings and for both 
the sets JCw for each of individual w which gives the measure E{JCw) or the entire class JC which 
gives the performance E{JC). The latter notion is the most meaningful when in applications it is 
not known which measurements w & W will appear or will be available. 

The present paper studies each of the above problems with the goal of determining the best 
algorithms. For this purpose, we introduce for any closed subspaces V and IT of ^ the quantity 


KV,W) := 


hll 

\\v - PvvW 


sup 




(1.5) 


A simple calculation shows that fj,{V,W) = I3{V,W) ^ where 


MV,W) := inf 

vSiV 


\\Pwv\\ 

Ikll 


inf sup 
w&w 



Note that in the case where V = {0} we have p(V, W) = 1. 

In ^of the paper, we analyze the one space problem, that is, JC = The inf-sup constant 

j3 was used in m, for the study of this problem, where the authors proposed an algorithm, in 
the form of a certain linear mapping A* : w ^ A*{w), then analyze its performance. While the 
approach in [16] is based on variational arguments, ours is quite different and geometric in nature. 
Our first goal is to establish that the algorithm proposed in m is both instance optimal and class 
optimal. We show that for any function u 


\\u-A*{Pwu)\\ < pL{Vn,W)disi{u,Vn). (1.6) 

Notice that if /3(Ti,IT) = 0, the above estimate would give no bound on approximation as is to 
be expected since Vn would contain elements of W-^ and these cannot be distinguished by the 
measurements. This would always be the case if n > m and so in going further we always work 
under the assumption that n <m. 

Let us note that this is a modest improvement on the estimate in m which has the constant 
Ai(I4)IT) -|- 1 rather than ij.{Vn,W) on the right side of (|1.6|1 . More importantly, we show that 
the estimate (frel) is best possible in the sense that the constant iJ.{Vn,W) cannot be replaced by 
a smaller constant. Another important remark, observed in m. is that in (|1.6h . dist(ri, Ti) can 
be replaced by the smaller quantity dist(u, T„ ® (IT n V^)). We establish, with our approach, the 
estimate 

\\u-A*{Pwu)\\ < pi{Vn,W)d\st{u,Vn® {W (1.7) 

which improves the constant given in m- We again show that pLiyn,W) is the best constant in 
estimates of this form. 

In view of (|1.6I) . the algorithm A* provides the class estimate 

Ea*{JC) < lM{Vn,W)en. (1.8) 


5 







We again show that this algorithm is class optimal in the sense that for the single space problem 

E{}C) = p{Vn,W)En. 

Our analysis is based on proving lower bounds which show that the upper estimates (ll.7h and (ll.8jl 
cannot be improved. These lower bounds apply to both linear and nonlinear algorithms, that is, 
([LTD and (jl.8|l cannot be improved also using nonlinear mappings. 

Another goal of our analysis of the one-space problem is to simplify the description of the 
optimal solution through the choice of, what we call, favorable bases for the spaces Vn and W. 
These favorable bases are then used in our analysis of the multi-space problem which is the object 
of ^ One possible way of proceeding, in the multi-space case, is to examine the right side of ()1.8p 
for each of the spaces {Vk)k=o,...,n, and choose the one which gives the minimum value. This would 
produce an algorithm A with the error bound 

Ea{1C) < min fi{Vk,W)£k. (1.9) 

0<k<n 

Notice that the £k are decreasing but the fi(yk,W) are increasing as k gets larger. So these two 
quantities are working against one another and the minimum may be assumed for an intermediate 
value of k. 

It turns out that the algorithm giving the bound (|1.9I1 may be far from optimal and our main 
achievements in ^are to produce both algorithms and a priori performance bounds which in general 
are better than that of (|1.9p . We show how the multi-space problem is connected to finding a point 
in the intersection of a family of ellipsoids in T-L and propose an algorithm based on this intersection 
property. Then, we give a priori bounds on the performance of our numerical algorithm, which are 
shown to be, in general, better than (II.9p . 

2 The one-space problem 

2.1 Preliminary remarks 

We begin with some general remarks which can be applied to our specific problem. If 5 C is 
a bounded set and we wish to simultaneously approximate all of the elements in S, then the best 
approximation is described by the center of the Chebyshev ball of S, which is defined as the smallest 
closed ball that contains S. To describe this ball, we first define the Chebyshev radius 

rad(5) := inf{r : S C B{v,r) for some v E Ti}. 

The following well known lemma says that the Chebyshev ball exists and is unique. 

Lemma 2.1 If S is any bounded set in % with R := rad(5), then there exists a unique v* & R 
such that 

ScB{v*,R). (2.1) 


Proof: For any u E "H, we define 

Rsiv) := inf {r : 5 C B{v, r)}. 
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which is a well-defined function from % to M. It follows from triangle inequality that Rs : —)• M 

is continuous. It is also easily seen that 


S C B{v,Rs{v)). 


By definition, rad(5) = inf^g-^ Now, consider any infimizing sequence {vj)j^fq, i.e., 

lim Rs{vj) = rad(5). 

j-^oo 


We claim that (uj)jgN is a Cauchy sequence. To see this, define rj := Rs{vj). For any fixed j and k 
and any z & S we define dj := Vj — z and '■= Vk — z. Then, ||dj|| < and ||dfc|| < r^. Therefore, 


\vj-Vk\?‘ = \\dj - dkW^ = {dj - dk,dj - dk) 

— 2{dj, dj) ‘2{dki dk) {dj T dk^ dj -|- dk) 

= 2||dj|p-|-2||(ifc|p — 4 —{dj+dk) 


< 2r| -I- 2rl — 4 


1 2 
^{Vj+Vk) - Z 


Since 2 ; E 5 is arbitrary we get 

\\vj - UfclP < 2r| + 2rl-A Rs(^^{vj + Vkij < 2r| 2rl - 4rad(5)^. 


Since rj,rk —)• rad(5), this shows that (uj)jgN is a Cauchy sequence and has a limit v*, which by 
the continuity of u i->- Rs{v) satisfies Rs{v*) = rad(5). The uniqueness of v* also follows from the 
above inequality by contradiction. By using the continuity of u Rs{v) one easily shows that 
(j2.ip holds. □ 


We sometimes say that v* in the above lemma is the center of S. For any bounded set S, the 
diameter of S is related to its Chebyshev radius rad(5) by the inequalities 

rad(5) < diam(5) < 2rad(5). 

For general sets S these inequalities cannot be improved. However, we have the following remark. 

Remark 2.2 Let S be symmetrie about a point z, i.e. whenever u E 5, then 2z — v & S. Then, 
the Chebyshev radius of S equals half its diameter, that is, diam(5) = 2rad(5) and its center is z. 

Remark 2.3 In the particular setting of this paper, for any given w ^ W sueh that fCw is non¬ 
empty, the optimal reeovery u*{w) over the class JC^ is obviously given by the eenter of ICw, and 
the class optimal performance is given by 

EiJCuj) = rad(/Cu,). 

Remark 2.4 For a bounded, closed, eonvex set S C R (which is always the ease in this paper) 
its center u is in S. In fact, if this was not true, by translating S, we ean assume u = 0. Let 
So = argmin^g 5 ||s||. By eonvexity sq exists, sq / 0, and {s,so) > {so,so), s gS. Thus 

sup ||s - SolP = sup((s, s) - 2{s, So) + (so, So)) < sup ||s||^ - llsoll^ 
whieh eontradicts the assumption that 0 is the center of S. 
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2.2 Optimal bounds for the one-space problem 

We next consider the case where the set K, = is given by where W is a fixed and known 

n dimensional space. In this section, we derive the algorithm proposed in |16] . however from a dif¬ 
ferent point of view emphasizing more the optimal recovery and geometric aspects of the problem. 
This allows us to improve on their estimates some but, more importantly, it is also useful when 
treating the multi-space problem. 

In the event that (3{Vn,W) = 0, the space Vn contains elements from W-^ which implies that if 
w G W is such that ICw is non-empty, then ICw is unbounded, or equivalently rad(/C^) is infinite, 
which means that we cannot hope for any guaranteed performance over fCw ■ This is the case in par¬ 
ticular when n > m. For this reason, in the rest of the paper, we always assume that jdiVn, W) > 0, 
which means in particular that n <m. 

Let w be any element from W. We claim that the map 

\\u — Pvr,u\\ = ||Pyxn||, 

admits a unique minimizer over the affine space Hw To see this, we let uq be any element from 
Hw It follows that every u G Hw can be written as u = uq -I t] for some rj G W-^. Minimizing 
||Pyxu|| over FLyj therefore amounts to minimizing the function 

V ^ fiv) ■■= \\Pvf-Uo + Pvf-Vf, 

over W'^. We may write 


fiv) ■=9iv) + 

where g is an affine function. Since we have assumed that /3(W, kb) > 0, the inequalities 

/ 3 iVn,W)M\ < WPyrvW < WvW, rjGW^. 

show that r] i—)• ||TVxr/|| is an equivalent norm over W-^. Therefore r/ i—)• f{r]) is strongly convex 
over W-^ and therefore admits a unique minimizer 

ij* := argmin/(?/). 


It follows that u* = uq -I Vj* satisfies 

u* = u*iw) := argmin ||rt — TV„rt|| 
u&Hw 

and that this minimizer is unique. 

Remark 2.5 If w is such that ICw is non-empty, there exists au G FLw such that ||u — 

Therefore \\u* — PvnU*\\ < £n, that is, u* G ICw- In particular, u* minimizes ||rt — TV„^|| over all 
u G ICw . 


We next define 


V* := v*{w) := Pv„u*- 

From the definition of u*, it follows that the pair (u* ,v*) is characterized by the minimization 
property 

lln*— 1 ;*||= min lln —?;||, (2.2) 

U&H^,veVr, 

As the following remark shows, u* — v* has a certain double orthogonality property. 

Remark 2.6 The element u* — v* is orthogonal to both spaces Vn and W-^. The orthogonality to 
Vn follows from the fact that v* = Py^u*. On the other hand, for any rj G W-^ and a G M, we have 

II* *ii2^ii T~i ii2 *1 

||u — u II < ||u — Pv„u\\ , u := u + arj, 


and thus 

\\u* - < ||u* -v*+ a{r] - = ||u* - + 2a{u* - v*,r]) + a^\\r] - Py„? 7 |p. 

This shows that u* — v* is orthogonal to W-^. 

Remark 2.7 Conversely, if u G O'lT'd v G Vn are such that u — v is orthogonal to both spaces 
Vn and W-^, then u = u* and v = v*. Indeed, from this orthogonality 

II* *ii2 II ii2iii* * / ^ll2 

||n —V II = ||tt — u|| + ||u —V — [u — v)\\ . 

This gives that u,v is also a minimizing pair and from uniqueness of the minimizing pair u = u* 
and V = V*. 

The next theorem describes the smallest ball that contains JC^, i.e., the Chebyshev ball for this 
set, and shows that the center of this ball is u*{w). 

Theorem 2.8 Let W and Vn be such that /3{Vn,W) > 0. 

(i) For any w G W such that fCw is non-empty, the Chebyshev ball for ICw is the ball centered at 
u* (w) of radius 

R* =R*{w) := fi{Vn,W)iel-\\u*{w)-v*{w)f)^/^. (2.3) 

(ii) The optimal algorithm in the sense of (m for recovering JCw from the measurement w is given 
by the mapping A* : w u*{w) and gives the performance bound 

EA*{lCn,) = E{]Cn,) = pi{Vn,W){el - ||u*(w) - v*{w)ff/\ (2.4) 

(hi) The optimal algorithm in the sense of dH for recovering JC is given by the mapping A* : w ca 
u*{w) and gives the performance bound 

Ea* (A) = E{}C) = fi{Vn, W)en. (2.5) 
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Proof: In order for /C^ to be nonempty, we need that ||rt* — r’*|| < Sn- Any u G Tiw can be written 
as u = u* + rj where rj G W'^. Therefore, 

u - Py^u = u* -V* +r] - Pvnil- 

Because of the orthogonality in Remark 12.61 we have 

\W - Pvr^uf = ||u* - v*f + \\r] - Pvr^vf- (2.6) 

Thus a necessary and sufficient condition for u to be in IC^ is that 

WPy^^f = h - Pvr.'nf < 4 - \\u* - Pf- 

From the dehnition of /u(I4, W), this means that any u G ICw is contained in the ball B{u* (w), R* (w)) 
Now, if T] is any element in IT-*- with norm R*{w) which achieves the maximum in the definition of 
W), then u* ±r] is in ICw and since ||r/|| = R*{w) we see that the diameter of /C^ is at least as 
large as 2R*{w). Since /C^ is the translation of a symmetric set, we thus obtain (i) from Remark 
[221 The claim (ii) about A* being the optimal algorithm follows from Remark 12.81 Finally, the 
performance bound (|2.5I) in the claim (iii) holds because the maximum of R*{w) is achieved when 
tc = 0. □ 

Remark 2.9 The optimal mapping w A*{w) = u*{w) is independent of £n and the knowledge 
of En is not needed in order to compute A*{w). 

Remark 2.10 Since is the intersection of the cylinder JC with the affine space TLw, it has the 
shape of an ellipsoid. The above analysis describes this ellipsoid as follows: a point u* +r] is in ICw 
if and only if ||PvTr?|P < —v*\\‘^. In the following section, we give a parametric description 

of this ellipsoid using certain coordinate systems, see Lemma \2.1Ji\ 

Remark 2.11 The elements u* and v* were introduced in m and used to define the algorithm 
A* given in the above theorem. The analysis from m establishes the error bound 

lk-n*(u;)|| < ifi{Vn,W) + l)dist{u,Vn(B{V^nW)). 

A sharper form of this inequality can be derived from our results. Namely, if u is any element in 
TL then we can define En '■= \\u — Pv„u\\ and w := Pwu. Then, u G /Ci„, for this choice of En, and 
so Theorem \2.^ applies and gives a recovery of u with the bound 

||n - u*(u;)|| < fi{Vn, W){eI - ||u* - = ^(^Vn, W)\\u - Pv^u - {u* - u*)||, (2.7) 

where the second equality follows from (12.6|) . We have noticed in B.ew.a,rk \2.f^ that u* —v* G R^nlT, 
and on the other hand we have that u — {u* — u*) G 14, + IF"*", which shows that 

u* -V* = Pyrnw'^- 

Therefore 


and (ETD gives 


Pvn'u + u* -v* — 


U-U*iw)\\ < ^liVn,W)distiu,Vn(BiV^r]W)). 
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Remark 2.12 Let us observe that given a space Vn with n < m we have iWCiV^) / {0}, thus the 
space Vn '■= Vn® {W n V^) is strictly larger than Vn- However pL{Vn,W) = p,{VmW) because for 
any rj G W-^, the projection of rj onto W H is zero. In other words we can enlarge Vn preserving 
the estimate (j2.5j) for class optimality performance as long as we add parts of W that are orthogonal 
to Vn- 

2.3 The numerical implementation of the optimal algorithm 

Let us next discuss the numerical implementation of the optimal algorithm for the one-space prob¬ 
lem. Let u!i,... ,0Jm be any orthonormal basis for W. For theoretical reasons only, we complete 
it to an orthonormal basis for H. So is a complete orthonormal system for W'^. We can 

write down explicit formulas for u* and v*. Indeed, any u G Hw can be written 

m oo 

U = ^ WiUJi ^ XiUJi, 

2 = 1 2=m+l 

where Wi := {w,uji), and {xi)iyrn is any £2 sequence. So, for any v ^ Vn and u G Hw, we have 

m CO 

\\u - vf ='^{Wi - Vif + {Xi-Vif, 

2=1 2=m+l 

where Vi := {v,Ui). Thus, for any v G Vn, its best approximation u{v) from Hu, is 

m CO 

u{v) :='^WiUJi+ ^ ViOJi, (2.8) 

2=1 2=m+l 

and its error of approximation is 

m 

||u - u(u)|p = '^{Wi - Vif. 

2 = 1 

In view of (|2.2|) we have 

m 

V* = argmin ||u — u{v)\\‘^ = argmin {wj — Vif' = argmin ||u; — Py/v\\^ . 

V&Vn V&Vn V&Vn 

For any given orthonormal basis {(pi, • • • , (pn} for Vn, we can find the coordinates of u* G I 4 , in this 
basis by solving the n x n linear system associated to the above least squares problem. Once v* is 
found, the optimal recovery u* = u*{w) is given, according to (|2.8p . by 

m 

u* =v* + ^{wi - v*)ui, 

2=1 

where v* = {v*, 0 Ji). Note that we may also write 

m CO 

U* = '^WiU:i® ^ {v*,uJi)uJi = w + . (2.9) 

2=1 2=m+l 


11 



2.4 Liftings and favorable bases for 14 and W 

It turns out that the above optimal algorithm has an even simpler description if we choose suitable 
bases for I 4 and W, which we call favorable bases. These bases will also be important in our analysis 
of the multi-space problem. To describe this new geometric view, we introduce the description of 
algorithms through liftings and see how the best algorithm of the previous section arises in this 
context. 

As noted earlier, any algorithm is a mapping A : W ^ T-L which takes w = Pwu into u{w) = 
A{w) = A{P]yu). This image serves as the approximant of all of the u G /C^. We can write any 
u G Kw as u = w + Pyy±u. So the problem is to hnd an appropriate mapping F : W ^ W-^ and 
take as the approximation 


u{'w) := A{w) -.= w + F{w). 

At this stage F can be any linear or nonlinear mapping from W into W-^. We call such mappings 
F liftings. 

According to (|2.9p . the optimal lifting F* is defined by 

F*{w) = Pw^v*{w) e Pw±Vn, 

which is actually a linear mapping since v* depends linearly on w. The algorithm A* (w) = w+F* (w) 
was shown in the previous section to be optimal for each class JC^ as well as for JC. Note that this 
optimality holds even if we open the competition to nonlinear maps F, respectively A. 

We next show that F* has a simple description as a linear mapping by introducing favorable 
bases. We shall make use of the following elementary facts from linear algebra: if X and Y are 
closed subspaces of a Hilbert space F, then: 

• We have the equality 


dim(PxT) = dimiPyX). 

This can be seen by introducing the cross-Gramian matrix G = {{xi,yj)), where (x*) and 
{yj) are orthonormal bases for X and Y. Then G is the matrix representation of the pro¬ 
jection operator Px from Y onto X with respect to these bases and G^ is the corresponding 
representation of the projection operator Py from X onto Y. Hence, 

dim(PxT) = rank(G) = rank(G*) = dim(PyX). 

• The space Y can be decomposed into a direct orthogonal sum 

Y = PyX®{YnX^). ( 2 . 10 ) 

For this, we need to show that Y n X-^ = Z where Z C Y is the orthogonal complement of 
PyX in Y. If y G Z, then {y, Pyx) = 0 for all x G X. Since (y, x — Pyx) = 0, if follows that 
(y, x) = 0, for all x G X, and thus y G T n X-*-. Conversely if y G T n X-*-, then for any x G X 
(y, Pyx) = — (y, X — Pyx) = 0, which shows that y & Z. 
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Now to construct the favorable bases we want, we begin with any orthonormal basis {4>i,..., cpn} 
of Vn and any orthonormal basis {wi,... ,0Jm} of W. We consider the mx n cross-Gramian matrix 

G := 

which may be viewed as the matrix representation of the projection operator P\y from Vn onto W 
using these bases since P\Y{4>j) = Note that the inf-sup condition (3{Vn,W) > 0 

means that 


dim(Piyl4) = n, 

or equivalently, the rank of G is equal to n. We perform a singular value decomposition of G, which 
gives 

G = USV^ 

where U = (uij) and V = (vij) are unitary mxm and nxn matrices, respectively, and where S is 
an m X n matrix with entries > 0 on the diagonal i = j, i = 1,... ,n, and zero entries elsewhere. 
This allows us to define new orthonormal bases {(/)^,..., (/)* } for Vn and {wj',... ,uj^} for W by 

n m 

(j)* = and u* = 

i=l i=l 

These new bases are such that 

Pw{4>)) = SjUJ*, j = l,...,n, 
and have diagonal cross-Gramian, namely 

= sjSij. 

Therefore ..., w*} and ...,are orthonormal bases for the n-dimensional space 

Pw^n and respectively its orthogonal complement in W which is Vj^ H W according to (I2.10p . 

By convention, we organize the singular values in decreasing order 

b ^ — ^n—1 — * * * — 

Since Pw is an orthogonal projector, all of them are at most 1 and in the event where 

Si S2 ■■■ Sp 1, 

for some 0 < p < n, then we must have 

This corresponds to the case where 14, G W is non-trivial and {wi,... ,a;*} forms an orthonormal 
basis for 14 n W. We define p = 0 in the case where 14 G IT = {0}. 

We may now give a simple description of the optimal algorithm A* and lifting F*, in terms of 
their action on the basis elements cut. For j = n -|- 1,..., m, we know that cut g n W. From 
Remark 12.71 it follows that the optimal pair (u*, v *) which solves (12.2p for w = oj* is 

u* = cu* and v* = 0, 
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and therefore 


A*{oj*)=oj* and F*{uj*) = 0, j = n + 1,... ,m. 

For j = 1,..., n, we know that u* = PwisJ^4>*). It follows that the optimal pair {u*,v*) which 
solves (|2.2I) for w = Uj is 

* * — 1 / * 

u =v = Sj (pj. 

Indeed, this follows from Remark 12.71 since this pair has u* — v* =0 and hence has the double 
orthogonality property. So, in this case. 


= and F*{uj*) 

Note in particular that F*{cOj) = 0 for j = 1,... ,p. 



Remark 2.13 The favorable bases are useful when computing the inf-sup constant /3(I4)IF). 
Namely, for an element v = ^ we find that Pwv = 


/3(K,II") 


min 

V&Vn 


\\Pwv\\ 

Ikll 


mm 

V&Vn 


E n 
7 = 


i=i 


min Sj 


— Sn- 


Correspondingly, 


ix{yn,w) = s-\ 


Recall that for the trivial space Vq = {0}, we have /r(Vb) = 1- 


For further purposes, we complete the favorable bases into orthonormal bases of Td by con¬ 
structing particular orthonormal bases for and W-^. According to (I2.10p we may write these 
spaces as direct orthogonal sums 

n IR^), 

and 

= Pw4Vn)e{v^nw^). 

The second space n W-^ in the above decompositions may be of infinite dimension and we 
consider an arbitrary orthonormal basis for this space. For the hrst spaces in the above 

decompositions, we can build orthonormal bases from the already constructed favorable bases. 

For the space Pyx (IT) we hrst consider the functions 

These functions are 0 for i = 1,... since co* G Vn for these values of i. They are equal to oj* for 
i = n + 1 , ...,m and to uj* — Sicp* for i = p+ 1 , ... ,n, and these m—p functions are non-zero pairwise 
orthogonal. Therefore an orthonormal basis of Pyx(IT) is given by the normalized functions 

{1 — sj)~^P{uj* — Si(f>l), i = p + l,...,n, and uj*, i = n + l,...,m. 
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By a similar construction, we find that an orthonormal basis of P\y± (14.) is given by the normalized 
functions 


(1-s?) i = P+1,... ,n. 

Therefore bases for and W-^ are defined as union of these bases with the basis (4* )i>i for 

v^r^w^. 

Finally, we close out this section, by giving a parametric description of the set = /C^(14) 
for the single space problem which shows in particular that this set is an ellipsoid. 

Lemma 2.14 Given a single spaee Vn CP, the body 

:= ICy^iVn) := /Cr (K) := {u G : Pwu = w} 

is a non-degenerate ellipsoid eontained in the affine spaee Pw 

Proof: Using the favorable bases for W and W-^, we can write any u G Pw as 

m n 

j=l j=P+l i>l 

where the wj = {w, cup for j = 1,..., m, are given, and the Xj and yj are the coordinates of tt — tc 
in the favorable basis of W'^. We may now write 

m n 

WjPy±uj* + Y - 4-^i) + Y 

j=i i=p+i *>i 

m n 

= Y - Y “ sjy^/^Sjiuj* - Sj(j)*) + Yy^'^j 

j=p-\-l j=p-\-l i>l 

m n 

= Y + Y yj'^j ■ 

j=n-\-l i=p+i *>i 

All terms in the last sum are pairwise orthogonal and therefore 

m n 

\\Pv+uf = Z] X] + 

j=p+i i>i 


Now u G KLw if and only if ||PiALn|p < e^, or equivalently 


Y s]{xj -ajf + Yyl - 

i=p+i i>i 

with C ■.= e\ — yjLn+ii^ ~ “ sJ)^^^s~^Wj which is the equation of a non¬ 
degenerate ellipsoid in P^}. □ 

Remark 2.15 The above equation (12.lip directly shows that the radius of Pw is equal to 
whieh is an equivalent expression of 
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3 The multi-space problem 


In this section, we consider the multi-space problem as described in the introduction. We are 
interested in the optimal recovery of the elements in the set K, := as described by For 

any given tc € VF, we consider the set 

n 

y' ._ tumult ._ i^mult nj _ V'i 

^ ^ I I 

3=0 

where 

:= K,^ n T-Lw := {u G Hu, : dist(ri, Vj) < £j}. 

In other words, ICi, is the set in the one-space problem considered in the previous section. We have 
seen that JCi, is an ellipsoid with known center (w) and known Chebyshev radius given by 

(j2.3p with n replaced by j, and u* and v* replaced by u* and v* in that formula. 

Thus, KLu, is now the intersection of n-|-l ellipsoids. The optimal algorithm A*, for the recovery 
of KLu], is the one that would find the center of the Chebyshev ball of this set and its performance 
would then be given by its Chebyshev radius. In contrast to the one-space problem, this center 
and radius do not have simple computable expressions. The first results of this section provide an 
a priori estimate of the Chebyshev radius in the multi-space setting by exploiting favorable bases. 
This a priori analysis illustrates when a gain in performance is guaranteed to occur, although the 
a priori estimates we provide may be pessimistic. 

We then give examples which show that the Chebyshev radius in the multi-space case can be 
far smaller than the minimum of the Chebyshev radii of the /Ci, for j = 0,..., n. These examples 
are intended to illustrate that exploiting the multi-space case can be much more advantageous than 
simply executing the one-space algorithms and taking the one with best performance, see (|2.4p . 

The latter part of this section proposes two simple algorithmic strategies, each of them con¬ 
verging to a point in fCy,. These algorithms thus produce a near optimal solution, in the sense that 
if A is the map corresponding to either one of them, we have 

Ea{}^w) ^ = 2E(JCw)-i w G IF, (3-1) 

and in particular 

Ea{1C) < 2E{K). (3.2) 

Both of these algorithms are iterative and based on alternating projections. An a posteriori estimate 
for the distance between a given iterate and the intersection of the ellipsoids is given and used both, 
as a stopping criteria and to analyze the convergence rates of the algorithms. 

3.1 A priori bounds for the radius of 

In this section, we derive a priori bounds for rad(/C™“^*). Although these bounds may overestimate 
rad(/C™'^^*), they allow us to show examples where the multi-space algorithm is significantly better 
than simply chosing one space and using the one-space algorithm. Recall that for the one-space 
problem, we observed that rad(/C°“®) is largest when re = 0. The following results show that for the 
multi-space problem rad(/C™'^^*) is also controlled by rad(/C™'^^*), up to a multiplicative constant. 
Note that is generally not a symmetric set, except for w = 0. In going further in this section 

fC and ICu, will refer to the multi-space sets. 
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Lemma 3.1 For the multi-space problem, one has 


Tad{}Cw) < 2rad(/Co), w ^W. 


(3.3) 


Therefore, 


E{IC) < 2rad(/Co). 


(3.4) 


Proof: Fix w and let u := u{w) be the center of the Chebyshev ball for JC^ which by Remark 
12.41 belongs to IC^. For any u G /C^„ we have rj := ^{u — u) is in IR-*- and also 

dist(r/, 14 ) < ^(dist(ti, 14 ) + dist(h, 14 )) <£k, k = 0,l,... ,n. 

Hence, rj ^ JCq which gives 


||u — h|| = 2||77|| < 2rad(/Co), 

where we have used the fact that, by Remark 12.21 the best Chebyshev ball for JCq is centered at 0. 
This proves ()3.3p . The estimate (|3.4p follows from the definition of E{JC). □ 


In view of the above Lemma l3.11 we concentrate on deriving a priori bounds for the radius of 
the set /Cq. We know that /Cq is the intersection of the ellipsoids /Cg for j = 0,1,... ,n, each of 
which is centered at zero. We also know that the Chebyshev ball for /Cg is i?(0,rad(/Cg) and we 
know from (12.41) that 

rad(/C^) = ^i{Vj,W)ej, j = 0,1,..., n, 
which is a computable quantity. This gives the obvious bound 

rad(/Co) < min n{Vk,W)£k. (3.5) 

0<k<n 

In the following, we show that we can improve on this bound considerably. Since JCq is symmetric 
around the origin, we have 

rad(/Co) = argmax ||7/||. 

So we are interested in bounding ||r/|| for each p G JCq. 

Since the spaces Vj are nested, we can consider an orthonormal basis {cfi,... ,(fn\ for 14, for 
which, {(fi,..., (fj} is an orthonormal basis for each of the Vj for j = 1,..., n. We will use the 
favorable bases constructed in the previous section in the case of the particular space 14- Note 
that if {(^4 • • •) ^*n\ is the favorable basis for 14 , we do not generally have that {(fi\, ...,</>*} is a 
basis of Vj . 

Let rj be any element from JCq. Since dist(T/, 14) < we may express p as 

n n 

7/ = ^ rjjf)* + e = ^ aj4>j + e, e G 14*“ and ||e|| < e„. 
i=i i=i 

So, 

n n 

ll^ll'= + = E “I + Ill'll'' 

i=i j=i 
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The aj and rjj are related by the equations 

n 

i=i 

where 


Xij := {4>j,(j)*), l<i,j<n. 

The fact that dist(7/, I 4 ) < for A: = 0,..., n is expressed by the inequalities 

n 

^ a) + \\ef <el, k = 0,...,n. 
j=k+l 


Since rj G W-^, we have that 

n 

0 = PwV = ^ SjrjjOj* + Pwe ■ 
i=i 


It follows that 


= \\Pwef < ||ef < el 


i=i 


We now return to the representation of /Cq in the (j)j coordinate system. We know that all aj 
satisfy \aj\ < Sj-i. This means that the coordinates {ai,... ,an} of any point in /Cq are in the 
n-dimensional rectangle 


R = [-So, eo] X • • • X [-en-i,en-i]. 

It follows that each r]i satisfies the crude estimate 

n n 

{■nil <'^\Xi,j\\aj\ <'^\\ij\ej-i =: 9i i = (3.6) 

i=i j=i 

The numbers 6i are computable. The bound (|3.6p allows us to estimate 

n n 

rad(/Co)^ = sup \\rif < + sup | ^ rj] : \r]j\ < Oj and ^ sj'q] < 4 } 

Since the Sj are non-increasing, the supremum on the right side takes the form 

n 

+ E ^1’ 0 < ,5 < 1 , 

j=k+l 

where k is the largest integer such that 

n 

E ‘Pi £ 4. (3.7) 

j=k 
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and 5 is chosen so that 


(3.8) 


j=k+l 

This gives us the following bound on the Chebyshev radius of JCq. 

n 

rad(/Co)^ <£^ + <50^+ ^ := El. (3.9) 

j=k+l 

Using this estimate together with Lemma l3.11 we have proven the following theorem. 


Theorem 3.2 For the multi-space problem, we have the following estimates for Chebyshev radii. 
For /Co, we have 


rad (/Co) < En, 

where En := Ul + 501 + Ei=fc+i <^1) . For any w £ W, we have 

rad(/C^) < 2En. 


For JC, we have the bound 


rad(/C) < 2En. 

We next compare the bound in ()3.9I1 with the one space bound 

rad(/Co) < /i( 14 , W)e„ = 

which is obtained by considering only the approximation property of Vn and not exploiting the 
other spaces Vj, j < n, see (j3.5p . For this, we return to the definition of k from (j3.7|] . We can write 
each term that appears in (j3.8l) as where Yl^=k'^j = 1. In other words, 

k<j<n, el = 

Hence, 

g .2 —2 2 ^ 

— '-n ' ‘-n — ^'’n 

which is at least as good as the old bound up to a multiplicative constant y/2. 

We finally observe that the bound En is obtained by using the entire sequence {Vq, ■ ■ ■, Vn}- 
Similar bounds Er are obtained when using a subsequence {Vj : j G F} for any F C {0, ... ,n}. 
This leads to the improved bound 

rad(/Co) < minlFlr : F C {0,... , n}}. 

In particular defining Ej = E-p for F = {0,..., j} we find that 

E] < 2fr{Vj, Wfe]. 

Therefore 

El < ^2 min p.(Vj,W)ej, 

j=0,...,n 

which shows that the new estimate is as good as (|3.5p up to the multiplicative constant ^/2. 
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3.2 Examples 

One can easily find examples for which the Chebyshev radius of is substantially smaller than 
the minimum of the Chebyshev radii of the therefore giving higher potential accuracy in the 
multi-space approach. As a simple example to begin this discussion, consider the case where 

= Eo = {0}, Ei = Mei, lE = M(ei + e 2 ) 

where ei = (1,0) and 62 = (0,1). So, Vi and W are one dimensional spaces. Then, with the choices 

V3 + 1 V3 + l\ 

4 ’ 4 j’ 

it is easily seen that JC^ is the single point and has therefore null Chebyshev radius while 

/C® and have positive Chebyshev radii. 

In more general settings we do not have such a simple description of JC^, however we now give 
some additional examples that show that even the a priori estimates of the previous section can be 
significantly better than the one space estimate as well as the estimate (13.5h . We consider the two 
extremes in the compatibility between the favorable basis ..., (/>*} and the basis ..., 
which describes the approximation properties of the sequence {Vq, ..., Vn}. 


£0 — 1 ) 


"^ = 2 ’ 


w = 


Example 1: In this example we consider the case where the two bases coincide, 

4 >* = i = l,...,n. 

Note that in this case the singular values {si,..., Sk} for the pair {I 4 , W} coincide with the first 
k singular values for the pair {Vn,W}. Therefore 

n{Vk,W) = s^^, k = 0,...,n, 

where we have set sq := 1- We also have 

8k — k — 1,..., n. 

We fix £ri ■= £ and £n-i ■= £n -2 '■= and the values Sn := e and := Sn -2 := and all 
other £k := 1 and all other Sk '■= 1- We examine what happens when e is very small. The estimate 
(|1.9h would give the bound 

min /i(Vfc,IT)efc= min sl^£k = l, 

as the bound for rad(/Co) and E{IC). On the other hand, since, 

'^n^n—1 ^ ^ ^ ana c , 

the value of k in ()3.7j) is n — 1. It follows that the error in the multi-space method (j3.9h satishes 

En < en-2 + ^n-1 + ^ 

Hence, the error for the multi-space method can be arbitrarily small as compared to the error of 
the one-space method. 
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Example 2: We next consider the other extreme where the two bases are incoherent in the 
sense that each entry in the change of basis matrix satisfies 

lAjjl < l<i,j<n. 

We want to show that En can be smaller than the estimate in (j.S.5h in this case as well. To illustrate 
how the estimates go, we assume that n > 2 and |Ajj| = l/\/n, for all 1 <i,j <n. We will take 

Sn ^ S Si S2 • • • Sji_i, 


with the values of s and Sn specified below. We define 
£q := 1/2 and £j = 


2(n- 1)^ 


j = 1 ,... ,n - 1, 


so that ~ follows from the definiton of Ok given in (13.6p that 


Ok = \fn ■.= 0^ k = l,...,n. 
With these choices, the best one space estimate m is 

min{eo, s~^£n-i,s~^en}. 

Now, we take £n very small and Sn = AV® then choose s so that 

{s^ + sl)0^ = el 


(3.10) 


(3.11) 


This gives A; = n — 1 in (13.7p and so 

-®n = + ^n-l + O'^ < 3n 

On the other hand, (|3.1ip says that s“^ = £n^{n — Thus, from (|3.10p . the best one space 

estimate is 

min{eo,s"^en-i,Sn^en} = min|i,—- J' £n\£n^| = 1/2, 

2 2(n — 1) V ^ ^ 

provided Hence, the multi-space estimate (j3.9p is better than the one space estimate 

by at least the factor in this case. 

3.3 Numerical algorithms 

In this section, we discuss some possible numerical algorithms, based on convex optimization, for 
the multi-space case. For any given data w G W, such that ICw is not empty, these algorithms 
produce, in the limit, an element A{w) which belongs to JC^, so that they are near optimal in the 
sense of ()3.1I) and (|3.2I) . 

We recall that ICw is given by 

n/c° n/c^ n • • • n/C”. 
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One first observation is that although the set may be infinite dimensional, we may reduce the 
search for an element in IC^ to the finite dimensional space 

F:=Vn + W, 

which has dimension d = m + n — p, where p = dim(14 O W). Indeed, if u € /C^, then its projection 
Pj^u onto P remains in , since u — Pj^u E W-^ H implies 

PwPtu = Pw'iJ' = w, 


and 


dist(Pj-u, Vj) < dist(tt, Vj) < Sj, j = 0,..., n. 

Therefore, without loss of generality, we may assume that 

'H = P, 

and that the sets T-Lw and JC^ that define IC^u are contained in this finite dimensional space. 

The problem of finding a point in the intersection of convex sets is sometimes referred to as 
convex feasibility and has been widely studied in various contexts. We refer to pun] for surveys on 
various possible algorithmic methods. We restrict our discussion to two of them which have very 
simple expressions in our particular case. Both are based on the orthogonal projection operators 
onto the spaces Tiw and IC^. Let us first observe that these projections are very simple to compute. 
For the projection onto T-Lw, we use the orthonormal basis {wi,... ,Ldm} of W. For any u G P we 
have 

m 

= Pw^'^ + w = u- '^{u,Ui)uJi + w. (3-12) 

i=l 

For the projection onto IC^, we extend the basis {4)i ,..., into an orthonormal basis {(/>i,..., (fd) 
of P. We then have 

j d d \ 1^21 

PK^u = ^{u,f)i)f)iP a := min|l,ej(^ ^ j. 

2=1 i=j-\-l 

We now describe two elementary and well-known algorithms. 

Algorithm 1: sequential projections. This algorithm is a cyclical application of the above 
operators. Namely, starting say from = w, we define for A: > 0 the iterates 

:= P^-Picr^-i • ■ • P^iP^oPn^u’^. 

We know from general results on alternate projections onto convex sets [I] that this sequence con¬ 
verges towards a point u* E JCw when ICy^ is not empty. We make further use of the following ob¬ 
servation: the nestedness property Vq C Vi C ... C Vji implies that belongs to 1C = . .H/C”'. 

Algorithm 2: parallel projections. This algorithm combines the projections onto the sets 
1C according to 

n 

:= Ph^{'£pP!C^)u\ 
j=0 
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where the weights 0 < 7 ^ < 1 are such that 70 + • • • + 7 n = 1, for example 7 ^ := It may be 
viewed as a projected gradient iteration for the minimization over T-Lw of the differentiable function 

” 1 
Fiu) ■= Fj{u) := -dist{u,Vf. 

j=0 

Notice that the minimum of F is attained exactly at each point of /C. Since VFj{u) = u — PjciU, 
we find that 


= PnAu^ -VF{u^)). 

Classical results on constrained minimization methods m show that this algorithm converges 
toward a minimizer u* of F{u) over which clearly belongs to JC^ when JC^ is not empty. 


3.4 A posteriori estimate and convergence rates 

Each of the above algorithms generates a sequence {u^)k>i of elements from P which are guaranteed 
to converge to a point in fCw provided that this set is nonempty. We would like to have a bound 
for dist(n^,/C^), since this would allow us to check the progress of the algorithm and also could 
be utilized as a stopping criterion when we have gained sufficient accuracy. Here we restrict our 
analysis to Algorithm 1. 

We will use certain geometric properties of the set JC, expressed by the following lemma. 
Lemma 3.3 Ifui,U 2 G JC then the ball B := B{uo,r) centered at uq := \{ui + U 2 ) of radius 


r := 


1 

8 


min e- 

j=0,...,n J 


j 


3 


(3.13) 


is completely contained in JC. 

Proof: For tti, tt 2 G JC^ the ball B{uo,r) is contained in JC^ if and only if the ball in centered at 
Py±uo with the radius r is contained in Py3_{JCJ) = {x G : \\x\\ < Cj} '.= Bj. Let vi := Py±{us) 
for s = 0 , 1 ; 2 and let 5j := Unj — n^ll- parallelogram identity gives 


I i 112 ^11 7112 I ^11 i 112 ^11 ?' 7112 

l^oll = 2 IIAII + 211^211 - iWi-viW , 


so that ||n^|p < e^ — \d‘j. Thus for 




W-l 


the ball in centered at Ug with radius rj is contained in Bj. Thus, with 


p := min r,, 

we have B{uo,p) C JC. Since 6j < 2sj and (1 — y/1 — x) > x/2 for 0 < x < 1 we get rj > (f|/( 8 ej) 
and therefore p > r from which (|3.13p follows. □ 
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We have noticed that the iterates of Algorithm 1 all belong to JC and we would like to 
estimate their distance from the convex set ICw Let Pic^{x) denote the point from fCu, closest to 
X. This is a well dehned map. The following result gives an estimate for the distance of any u G /C 
from fCw, in terms of its distance from the affine space Pw ■ This latter quantity is easily computed 
using (I3.12P which shows that 


m 

u - Pn^u = Pwu -w = - w. 

i=l 


Lemma 3.4 Let u & JC be such that 


a := dist(u,'H«,) > 0. 


Then 

where fij 


\\Pn^u - Pk^u\\ < p = p{a) := maxpj{a + 

3 

p{Vj, W). Since u — Ph^u is orthogonal to Pu^u — Pic^u, we have 


dist(u, JCw)^ <a^ + p{af‘. 

Proof: We set U2 = Pk^ u and p = u — U2 which we decompose as 


(3.14) 


v = (u- Ph^u) + {Pu^u - U 2 ) =■■ Pi + P 2 - 

We wish to show that \\p 2 \\ < P, where p is dehned in ()3.14l) . To this end, observe that 7/1 G IT 
and p 2 G W'^ so that this is an orthogonal decomposition. Moreover, using (jl.5p and noting that 
llr/ill = a, we have 

^ \\Pv:^h 2 \\ - ll-fV.rf/ill > P{Vj,W)\\p 2 \\ - a. (3.15) 

J 3 3 

We infer from Lemma 13.31 that the ball B with center at no = ^ (n + n 2 ) and radius 

8 j=0,l,...,n 3 

is contained in 1C. Let us suppose now that II 7 / 2 II > P and derive a contradiction. Then, we obtain 
from (|3.15l) 

ll-fVrJ-^ll > - ot> pj^pj{a + 4.y/a£j) - a = A^/aej , 

and thus 

1 

r > - min e ■ IGae,' = 2a. 

On the other hand, note that ||no — P'Hw'^oW = |||^ ~ P'Hw'^W — ^l‘^- Therefore, Pu^'^o ^ P- and 
hence in /C^. Moreover, 

Ik - P-H^nolP = + ^Ik2 - Ph^uW"^, 
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and 


\\u - U2f = + \\U2 - Pn^uf. 

If U 2 / -F’Wuj'w, we have ||m — Ph^uqW < ||it — n 2 || which is a contradiction since U 2 is the closest 
point to u in ICy^. If U2 — Ph.w'^ — 0 then 772 = 0 contradicting II 772 II > P- This completes the proof. □ 


One immediate consequence of the above lemma is an a posteriori error estimate for the squared 
distance to /C^ 

4 := dist(^^^,/C^„)^, 

in Algorithm I. Indeed, we have observed that G /C and therefore 

5k < Oil + p{otkf, ak-.= 


This ensures the following accuracy with respect to the unknown u G JCy,: 

\\u-u’^\\ < \Ja\ + p{akY + 2rad(/C^). 

If we have an a priori estimate for the Chebyshev radius of such as the bound from Theorem 
13.21 one possible stopping criterion is the validity of 

^Jal + piok)^ < En- 

This ensures that we have achieved accuracy ||tt — u^|| < 3£'„, however note that E^ can sometimes 
be a very pessimistic bound for rad(/C«,) so that significantly higher accuracy is reachable by more 
iterations. 

We can also use Lemma 13.41 to establish a convergence estimate for 4 in Algorithm 1. For this 
purpose, we introduce the intermediate iterates 

:= Ph^u\ 


and the corresponding squared distance 

4+i ■= dist(u *'+2 

Since the distance to Ky, is non-increasing in each projection steps, it follows that 

4+1 < <^fc+i =5k- Oil- 

On the other hand, it easily follows from Lemma 13.41 that 


4 - ai < p{akf < Aak, 

where A is a constant depending on e^’s, pj's and ||u||. It is easily seen that this implies the validity 
of the inequality 


Oik P "s/4 + A^/4 — A/2 > 


5k 


> 


5k 


\/ + 44 y/ A^ + 44 


■ — c5kj 
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and therefore 


(^fc+i <Sk — c^Sl- 


From this, one finds by indnction that 

6k < Ck~^, k >1, 


for a suitably chosen constant C := 


max{c taking into account that for any t > 1 


C 

J 




< C 



< 


c 

t + 1 


□ 


Remark 3.5 The above convergence rate 0{k~^/‘^) for the distance between and ICw is quite 
pessimistic, however, one can easily exhibit examples in which it indeed occurs due to the fact that 
TLw intersects K, at a single point of tangency. On the other hand, one can also easily find other 
examples for which convergence of Algorithm 1 is exponential. In particular, this occurs whenever 
JCu, has an element lying in the interior of 1C. 
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